OCLoptimizer: An Iterative Optimization Tool for OpenCL

Authors

Jorge F. Fabeiro

Diego Andrade

Basilio B. Fraguela

Abstract

Nowadays, computers include several computational devices with parallel capabilities, such as multicore processors and Graphics Processing Units (GPUs). OpenCL enables the programming of all these kinds of devices. An OpenCL program consists of a host code, which discovers the computational devices available in the host system and queues up commands to them, and the kernel code, which defines the core of the parallel computation executed on the devices. This work addresses two of the most important problems faced by an OpenCL programmer: (1) host codes are quite verbose, but they can be generated automatically if some parameters are known; (2) OpenCL codes that are hand-optimized for a given device do not necessarily achieve good performance on a different one. This paper presents a source-to-source iterative optimization tool, called OCLoptimizer, that aims to generate host codes automatically and to optimize OpenCL kernels, taking as inputs an annotated version of the original kernel and a configuration file. Iterative optimization is a well-known technique that optimizes a given code by exploring different configuration parameters in a systematic manner. For example, we can apply tiling to one loop, and the iterative optimizer would select the optimal tile size by exploring the space of possible tile sizes. The experimental results show that the tool can automatically optimize a set of OpenCL kernels for multicore processors.
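
The tile-size example in the abstract can be illustrated with a minimal sketch. This is not OCLoptimizer's implementation: the function names `tiled_sum` and `search_tile_size`, the candidate list, and the use of a plain Python loop in place of an OpenCL kernel are all assumptions made for illustration. The sketch only shows the general shape of iterative optimization: run the same code under each candidate configuration, measure it, and keep the fastest.

```python
import time


def tiled_sum(data, tile):
    """Sum `data` by walking it in blocks of `tile` elements (1-D loop tiling).

    Hypothetical stand-in for a kernel; a real tool would run OpenCL code.
    """
    total = 0.0
    n = len(data)
    for start in range(0, n, tile):                 # loop over tiles
        for i in range(start, min(start + tile, n)):  # loop inside one tile
            total += data[i]
    return total


def search_tile_size(kernel, data, candidates, repeats=3):
    """Time `kernel` for every candidate tile size.

    Returns (best_tile, timings), where timings maps each tile size to the
    fastest of `repeats` measured runs, in seconds.
    """
    timings = {}
    for tile in candidates:
        fastest = float("inf")
        for _ in range(repeats):
            t0 = time.perf_counter()
            kernel(data, tile)
            fastest = min(fastest, time.perf_counter() - t0)
        timings[tile] = fastest
    best_tile = min(timings, key=timings.get)
    return best_tile, timings


# Exhaustively explore a small, assumed space of tile sizes.
data = [float(i) for i in range(10000)]
candidates = [1, 8, 64, 512]
best_tile, timings = search_tile_size(tiled_sum, data, candidates)
print("fastest tile size on this machine:", best_tile)
```

The selected tile size depends on the machine running the search, which is the point made in the abstract: a configuration tuned by hand for one device is not necessarily the best one on another.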

Similar resources

Automatic Generation of Optimized OpenCL Codes Using OCLoptimizer

The eruption of multicore processors and several kinds of accelerators has generalized the interest in parallel programming. The OpenCL standard is very appealing because it provides code portability across most of these platforms. It defines a programming model where a host code requests the execution of kernels in computational devices. Unfortunately, the host API of OpenCL is quite verbose, ...

Optimizing OpenCL Kernels for Iterative Statistical Applications on GPUs

We present a study of three important kernels that occur frequently in iterative statistical applications: K-Means, MultiDimensional Scaling (MDS), and PageRank. We implemented each kernel using OpenCL and evaluated their performance on an NVIDIA Tesla GPGPU card. By examining the underlying algorithms and empirically measuring the performance of various components of the kernel we explored the...

OpenCL-Based Design of an FPGA Accelerator for Phase-Based Correspondence Matching

This paper proposes a Field Programmable Gate Array (FPGA) implementation of the stereo correspondence matching using Phase-Only Correlation (POC). The use of high-accuracy stereo correspondence matching based on POC makes it possible to measure accurate 3D shape of an object using stereo vision. The drawback of the POC-based approach is its high computational cost. To address this problem, we ...

Parallel local search on GPU and CPU with OpenCL Language

Real-world optimization problems are very complex and NP-hard. The modeling of such problems is in constant evolution in terms of constraints and objectives, and their resolution is expensive in computation time. With all this change, even metaheuristics, well known for their efficiency, begin to be overtaken by data explosion. Recently, thanks to the publication of languages such as OpenCL and CUDA, ...

Iterative statistical kernels on contemporary GPUs

We present a study of three important kernels that occur frequently in iterative statistical applications: Multi-Dimensional Scaling (MDS), PageRank, and K-Means. We implemented each kernel using OpenCL and evaluated their performance on NVIDIA Tesla and NVIDIA Fermi GPGPU cards using dedicated hardware, and in the case of Fermi, also on the Amazon EC2 cloud-computing environment. By examining ...

Publication date: 2013
